# A Comparison on Different Layout Topologies for Volatile Dynamic Flip-Flops' Feedback Branch

Klaus Holler, Paulo F. Butzen and Raphael M. Brum Departamento de Engenharia Elétrica Universidade Federal do Rio Grande do Sul Av. Osvaldo Aranha, 103 - 90035-190 Porto Alegre, RS, Brazil klaus.holler@ufrgs.br, paulo.butzen@ufrgs.br, brum@ufrgs.br

*Abstract*—This paper compares two feedback-branch implementations for volatile dynamic D-type Flip-Flops. One uses a Tristate Inverter and the other uses an Inverter in series with a Transmission Gate in its place. The work focuses on obtaining the time-characterization of both sequential cells through simulations. The fundamental times are explained, and metrics for obtaining them are developed. The results show that, in general, each Flip-Flop behaves differently. Time constraints do not depend on the output capacitance, but time arcs do.

*Index Terms*—Volatile D Flip-Flop, time-characterization, CMOS, constraints, delay, topologies, simulation.

## I. INTRODUCTION

Flip-Flops (FFs) are among the simplest and most reliable sequential elements available, especially the FF-D [1]. One of the major advantages of FFs is that they keep the circuit synchronized with the clock and are also able to filter glitches [2]. These properties make FFs essential to almost every modern synchronous circuit [3]. The FF-D is not difficult to use or to comprehend; nevertheless, there are timing constraints that must be respected to avoid critical failures during operation [4]. These constraints are obtained through the time-characterization process. The results of this step define the speed at which a circuit can operate [3]. Thus, the design of a circuit can be optimized through a well-defined time-characterization [4].

Several FF topologies are reported in the literature [1]. Each of them has its own advantages and disadvantages [4]. However, most designs employ the topology based on two back-to-back dynamic latches (master-slave D-type FF) due to its robustness, compactness and energy-efficiency [5]. The logical configuration of this topology can be achieved through different transistor configurations [1]. Two options are to use a Tristate Inverter (TI) in the feedback branch, or the logically equivalent Inverter in series with a Transmission Gate (INV-TG) [1].

Although equivalent in logical behavior, FFs using TI or INV-TG in the feedback branch differ electrically and temporally [1]. Choosing the best design can be extremely time-consuming due to the great number of tradeoffs in time, physical area and energy consumption [5]. For these reasons, a detailed and reliable time-characterization is necessary to ease the design burden [2].

This paper focuses on comparing the time arcs and time constraints of the FF-D with TI and INV-TG feedback. A time-characterization is made through several simulations. Section II presents a brief explanation of the FF-D, focused on the relevant time aspects. Section III describes the techniques developed to obtain these times. Section IV shows the experimental setup and the results obtained. Section V concludes the paper and outlines possibilities for future work.

## II. DYNAMIC FLIP-FLOP

A Flip-Flop is an edge-triggered storage (memory) element that holds a piece of data called a token [1]. Dynamic FFs are those that have a clock input, fundamental in sequential circuits [1]. The output of the FF depends on the current input value as well as on the clock signal: it can only sample the input during a transition in the clock level (rise or fall) [2]. When the clock signal rises (or falls), the input (D) value is passed to the output (Q). Further variations in the input are not reflected in the output during the rest of the clock cycle [2]. To better comprehend the sequential cell in question, a set of time arcs and constraints must be considered [2]. These times are explained in detail below and illustrated in Fig. 1.

#### A. Time Arcs

a) Clock-to-Q Contamination Delay (Tccq): It is the time the output takes to start changing after the active clock-edge [2]. The Tccq is generally different for rising and falling output values [3]. Therefore, it must be defined for fall (Tccq-F) and rise (Tccq-R).

b) Clock-to-Q Propagation Delay (Tpcq): It is the time the output takes to settle to a final stable value after the active clock-edge [1]. As with the contamination delay, Tpcq must be defined for fall (Tpcq-F) and rise (Tpcq-R) [3]. Tpcq is necessarily greater than Tccq.

c) Transition time (slew): It is the time the output signal takes to rise $(0\rightarrow 1)$ or fall $(1\rightarrow 0)$ [3]. It is defined for fall (*Tfall*) and rise (*Trise*). Unlike *Tccq* and *Tpcq*, transition times only consider the output terminal [2].

#### B. Time Constraints

a) Setup Time (Tsetup): It is the time interval before the active clock-edge during which the input value must be stable to guarantee the correct output signal [1]. However, a *Tsetup* that is too short increases *Tpcq*; hence, *Tsetup* depends on the allowed *Tpcq* tolerance [4].

b) Hold Time (Thold): It is the time interval after the active clock-edge during which the input value must be stable [1]. Similar to *Tsetup*, *Thold* depends on the *Tpcq* tolerance [4].



Fig. 1. FF time arcs and constraints [1]

## III. WORK PROPOSAL

Initially, a series of definitions was necessary in order to perform the time-characterization simulations. The FF topology was chosen for its widespread use in standard cell libraries [5]. This FF is shown in Fig. 2 in its two variants, with TI and with INV-TG feedback branches [1]. Both FFs were designed, and all simulations were run, with one fixed size for all NMOS transistors and one fixed size for all PMOS transistors. Since there are several parameters to vary and compare, keeping the transistor sizes constant made the time-characterization viable.



Fig. 2. FF topologies: (a) with TI (b) with INV-TG.

The test-bench was developed with a Piece-Wise Linear (PWL) source for the clock (CLK) signal. The CLK was connected to an Inverter to obtain the Inverted Clock (NCLK). The input (IN) terminal also used a PWL source; the IN value varied at specific times to put the FF in the desired configuration. The supply voltage was provided by a DC voltage source. All voltages used in the test-bench varied within the same range to simplify the analysis of the results. Different capacitance values were set at the output to characterize the FFs in distinct situations. The metrics adopted to acquire the precise time arcs are listed below:

a) Tccq: Tccq-F is defined as the time the output takes to rise by 0.1% after the active clock-edge reaches half of its voltage. The output rises slightly before falling, so contamination is considered to begin when the output rises; this way, noise and instabilities are not taken into consideration [4]. For Tccq-R an analogous metric is used, considering that the output falls by 0.1%, since the output falls slightly before rising [4]. The IN voltage must be stable long before the active clock-edge.

*b) Tpcq:* For both *Tpcq-F* and *Tpcq-R* it is the time the output takes to reach half of its voltage swing after the active clock-edge reaches half of its voltage swing. IN must be stable long before the active clock-edge.

c) Transition time: *Tfall* is the time the output takes to go from the maximum to the minimum voltage value in the range, without considering signal stability. An analogous metric is used for *Trise*, considering the output variation from the minimum to the maximum voltage value.
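The *Tpcq* metric above can be illustrated on sampled waveforms. The following is a minimal sketch, assuming the simulator output is available as plain lists of time and voltage samples; the helper names `crossing_time` and `clock_to_q_delay` and the synthetic waveforms are hypothetical and are not part of the MDL flow used in this work.

```python
from typing import Sequence


def crossing_time(t: Sequence[float], v: Sequence[float], level: float,
                  rising: bool, t_start: float = 0.0) -> float:
    """First time v crosses `level` in the requested direction.

    Segments that end before t_start are skipped; the crossing instant is
    linearly interpolated between the two samples that bracket it.
    """
    for i in range(1, len(t)):
        if t[i] < t_start:
            continue
        v0, v1 = v[i - 1], v[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    raise ValueError("waveform never crosses the requested level")


def clock_to_q_delay(t, clk, q, vdd: float = 1.0, q_rising: bool = True) -> float:
    """Tpcq: from CLK rising through vdd/2 to Q passing through vdd/2."""
    t_clk = crossing_time(t, clk, vdd / 2, rising=True)
    t_q = crossing_time(t, q, vdd / 2, rising=q_rising, t_start=t_clk)
    return t_q - t_clk


# Synthetic waveforms (time in ps, voltage in V); these are NOT simulation data.
t_ps = [0.0, 100.0, 200.0, 300.0, 400.0]
clk_v = [0.0, 0.0, 1.0, 1.0, 1.0]  # rising edge, passes 0.5 V at 150 ps
q_v = [0.0, 0.0, 0.0, 1.0, 1.0]    # output rises, passes 0.5 V at 250 ps

tpcq_r = clock_to_q_delay(t_ps, clk_v, q_v)
print(f"Tpcq-R = {tpcq_r:.1f} ps")
```

The same `crossing_time` helper could be reused with a different `level` for a threshold-based *Tccq* measurement, provided the threshold is chosen according to metric a).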

For time constraints, another approach was necessary. Tsetup depends on *Thold* and vice versa [3]. Both also affect the *Tpcq* [3]. The most challenging complication is deciding whether the results of a simulation refer to Tsetup or to Thold. In certain cases, when IN is stable at one value and starts to vary close to the active clock-edge, the IN skew may start on one side of the clock-edge and end on the other side. In these cases a valid option to separate *Thold* from Tsetup is to use a symmetrical pulse centered on the CLK, called the Pulse Method. What comes before the middle of the pulse is considered to be *Tsetup* and what comes after is *Thold*. This solution is reliable for finding the time constraints, but the results are usually greater than what would be found in a real situation. Where the pulse is centered within the CLK skew is also relevant; therefore two placements were considered: PulseSTR and PulseMID. The Pulse Method is not itself a time constraint; it is used to obtain them. All simulations must be performed for both falling and rising output [4].

d) Tsetup: Initially a Tpcq reference is obtained (Tpcq-R for Tsetup-R and Tpcq-F for Tsetup-F). The output is configured to fall (or rise) again in a later cycle. In this cycle IN is set to start rising as close as possible to the active CLK rising edge while keeping Tpcq no more than 10% above the reference obtained before. After settling at the new value, IN stays stable. Observe that IN rises but the output falls for Tsetup-F, and both IN and output rise for Tsetup-R.

*e) Thold:* The same metric used for *Tsetup-F* and *Tsetup-R* is applied for *Thold-F* and *Thold-R* respectively; the only difference is that IN falls, instead of rising, as close as possible to the active CLK rising edge. After settling at the new value, IN stays stable. For *Thold-F* both IN and output fall. For *Thold-R* IN falls and output rises.

f) PulseSTR: A Tpcq reference is initially obtained. To obtain PulseSTR, a PWL voltage source is parametrized at IN; the source generates a pulse that is centered at the beginning of the active CLK rising edge and is symmetrical to both sides (trapezoidal shape). The shortest possible pulse that keeps *Tpcq* no more than 10% above the reference is searched for. The difference between *PulseSTR-F* and *PulseSTR-R* is the pulse voltage value. For fall, the pulse starts at the minimum voltage and rises symmetrically to the maximum voltage at both extremities. For rise, the pulse starts at the maximum voltage and falls symmetrically at both extremities.

g) PulseMID-F and PulseMID-R: Almost identical to PulseSTR-F and PulseSTR-R respectively; the difference is that the pulse is centered in the middle of the active CLK skew.
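The pulse construction of metrics f) and g) can be sketched as a list of PWL breakpoints. This is only an illustration under stated assumptions: the function names, the use of a flat `half_width` on each side of the center as the searched variable, and the example edge position of 8000 ps are all hypothetical; the source only states that the pulse is symmetrical, trapezoidal, and centered at the start (PulseSTR) or middle (PulseMID) of the active CLK skew.

```python
from typing import List, Tuple


def pulse_center(edge_start: float, clk_skew: float, placement: str) -> float:
    """Center of the IN pulse: start ("STR") or middle ("MID") of the CLK edge."""
    if placement == "STR":
        return edge_start
    if placement == "MID":
        return edge_start + clk_skew / 2
    raise ValueError("placement must be 'STR' or 'MID'")


def trapezoid_pulse(center: float, half_width: float, in_skew: float,
                    v_idle: float, v_pulse: float) -> List[Tuple[float, float]]:
    """PWL breakpoints (time, voltage) of a symmetric trapezoidal pulse.

    IN sits at v_idle, ramps to v_pulse over in_skew, stays at v_pulse from
    center - half_width to center + half_width, then ramps back to v_idle.
    """
    if half_width < 0 or in_skew <= 0:
        raise ValueError("half_width must be >= 0 and in_skew must be > 0")
    return [
        (center - half_width - in_skew, v_idle),
        (center - half_width, v_pulse),
        (center + half_width, v_pulse),
        (center + half_width + in_skew, v_idle),
    ]


# Hypothetical example in ps: CLK edge starting at 8000 ps with 100 ps skew.
# For the "fall" case the pulse is low in the middle and high at both ends.
center_mid = pulse_center(edge_start=8000.0, clk_skew=100.0, placement="MID")
points = trapezoid_pulse(center_mid, half_width=1.0, in_skew=100.0,
                         v_idle=1.0, v_pulse=0.0)
for time_ps, volts in points:
    print(f"{time_ps:8.1f} ps  {volts:.1f} V")
```

For the "rise" case the same function would be called with `v_idle=0.0` and `v_pulse=1.0`.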

## IV. EXPERIMENTAL SETUP AND RESULTS

In total, 9 simulations were performed for each FF topology. All time arcs were obtained in the first simulation, since they are phenomena of the FF's regular operation and, unlike the time constraints, do not require a search for the behavior in boundary situations. Two languages were used to perform the simulations: SPICE to create the netlist and testbench of the circuits, and MDL to run the analysis on the netlist. The simulations were performed by Cadence Spectre Software<sup>TM</sup>, which ran the MDL analysis on the netlist. The resulting *time x voltage* graphs were inspected in Cadence Virtuoso Software<sup>TM</sup> to confirm simulation success and for debugging. The NCSU FreePDK 45nm was the technology used in this work because it is a free, open, well-established and reliable library. All NMOS transistors had a width (W) of 415nm and a length (L) of 50nm, and all PMOS transistors had a W of 630nm and an L of 50nm. The transistor sizes were based on the Nangate Inc. cell design [6].

Each of the 9 simulations was carried out with 7 different output capacitances. The capacitances were defined as multiples of a single Inverter capacitance, which was adopted as 0.2fF. The series is as follows: 0.2fF, 0.4fF, 0.8fF, 1.9fF, 3.8fF, 7.6fF and 15.2fF. The time-dependent elements, IN and CLK, were defined as PWL voltage sources varying from 0V to 1V. The CLK had a period of 2ns and a 0.1ns skew (rise and fall transition time). IN behaved differently in each simulation, with the same voltage range and skew as the CLK. The supply voltage was a DC 1V source.
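As an illustration of the CLK stimulus described above, the sketch below generates PWL breakpoints for a 0V to 1V clock with a 2ns period and 0.1ns edges. The 50% high fraction and the choice of starting each cycle with the rising edge are assumptions made only for this example; the function name `clock_pwl` is hypothetical.

```python
from typing import List, Tuple


def clock_pwl(n_cycles: int, period: float = 2000.0, skew: float = 100.0,
              vdd: float = 1.0, high_fraction: float = 0.5) -> List[Tuple[float, float]]:
    """PWL breakpoints (ps, V) of a clock that starts rising at each cycle start.

    high_fraction is the (assumed) fraction of the period between the start
    of the rising edge and the start of the falling edge.
    """
    t_high = high_fraction * period
    if not (skew < t_high and t_high + skew < period):
        raise ValueError("edges do not fit inside one period")
    points: List[Tuple[float, float]] = []
    for k in range(n_cycles):
        t0 = k * period
        points += [
            (t0, 0.0),               # rising edge starts
            (t0 + skew, vdd),        # rising edge ends
            (t0 + t_high, vdd),      # falling edge starts
            (t0 + t_high + skew, 0.0),  # falling edge ends
        ]
    points.append((n_cycles * period, 0.0))  # hold low until the end
    return points


clk_points = clock_pwl(n_cycles=5)
print(clk_points[:5])
```

With these assumptions the rising edge of the fifth cycle starts at `clk_points[16]`, that is, at 8000 ps.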

The Inverter, the TI and the TG, the three fundamental components of the back-to-back Latch FF, were described as SPICE sub-circuits. FF-TI and FF-INV-TG were described using these sub-circuits. The SPICE description of the FFs and test-bench was almost the same for all simulations; the difference resided in the IN behavior. All simulations were transient. No measurements were made in the first CLK cycle, during which the output stabilized. The simulations applied the metrics from Section III. The MDL tools used in each simulation are presented next. The results are discussed and, when necessary, presented in tables.

a) First simulation (Tccq-F, Tccq-R, Tpcq-F, Tpcq-R, Tfall, Trise): To obtain Tpcq-F, Tpcq-R, Tccq-F and Tccq-R the *deltax* function was used. For *Tfall* the *falltime* function was used and, similarly, *risetime* for *Trise*. One CLK cycle was used to perform all rise measurements and another one for the fall measurements.

The results in Table I show no great difference between TI and INV-TG. Even so, Tccq-R and Tpcq-R are slightly faster for INV-TG, and as the capacitance rises the INV-TG Tccq-F becomes faster than that of TI. For TI, *Tfall* and *Trise* are faster, and *Tpcq*-F becomes faster than that of INV-TG as the capacitance rises. All fall times are faster than the equivalent rise times.

TABLE I Time Arcs Results<sup>1</sup>

| Cap (fF)        | 0.2    | 0.4    | 0.8    | 1.9    | 3.8    | 7.6    | 15.2   |
|-----------------|--------|--------|--------|--------|--------|--------|--------|
| Tccq-F (TI)     | 65.87  | 66.11  | 66.60  | 67.91  | 69.73  | 71.85  | 74.63  |
| Tccq-F (INV-TG) | 65.90  | 66.13  | 66.60  | 67.83  | 69.50  | 71.67  | 74.40  |
| Tccq-R (TI)     | 92.54  | 92.72  | 93.06  | 94.16  | 96.07  | 98.21  | 101.82 |
| Tccq-R (INV-TG) | 91.36  | 91.54  | 91.92  | 92.99  | 94.75  | 97.15  | 100.61 |
| Tpcq-F (TI)     | 129.50 | 130.79 | 133.10 | 138.50 | 146.69 | 160.86 | 188.31 |
| Tpcq-F (INV-TG) | 129.29 | 130.69 | 133.20 | 139.15 | 147.88 | 162.64 | 190.12 |
| Tpcq-R (TI)     | 158.85 | 160.19 | 162.67 | 168.55 | 177.43 | 194.02 | 226.46 |
| Tpcq-R (INV-TG) | 157.51 | 158.93 | 161.53 | 167.76 | 177.21 | 194.02 | 226.46 |
| Tfall (TI)      | 14.15  | 15.47  | 18.06  | 24.86  | 36.44  | 60.21  | 109.12 |
| Tfall (INV-TG)  | 14.94  | 16.50  | 19.51  | 26.80  | 38.51  | 62.05  | 110.17 |
| Trise (TI)      | 16.46  | 18.03  | 20.99  | 29.12  | 43.58  | 73.58  | 135.38 |
| Trise (INV-TG)  | 16.90  | 18.56  | 21.84  | 30.43  | 44.92  | 74.71  | 136.12 |

<sup>1</sup>Values in picoseconds (ps).
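The dependence of the time arcs on the output capacitance can be summarized with a short calculation on the *Tpcq-F* rows of Table I. The sketch below is an added illustration, not an analysis performed in this work: the end-to-end average slope and the helper names are assumptions chosen for simplicity.

```python
from typing import List, Optional

# Output loads (fF) and Tpcq-F values (ps) copied from Table I.
LOADS_FF = [0.2, 0.4, 0.8, 1.9, 3.8, 7.6, 15.2]
TPCQ_F_PS = {
    "TI": [129.50, 130.79, 133.10, 138.50, 146.69, 160.86, 188.31],
    "INV-TG": [129.29, 130.69, 133.20, 139.15, 147.88, 162.64, 190.12],
}


def average_slope(loads: List[float], delays: List[float]) -> float:
    """Average delay increase per fF between the smallest and largest load."""
    return (delays[-1] - delays[0]) / (loads[-1] - loads[0])


def first_load_where_faster(loads: List[float], a: List[float],
                            b: List[float]) -> Optional[float]:
    """Smallest load at which delay series a is strictly below series b."""
    for load, delay_a, delay_b in zip(loads, a, b):
        if delay_a < delay_b:
            return load
    return None


slopes = {name: average_slope(LOADS_FF, d) for name, d in TPCQ_F_PS.items()}
crossover = first_load_where_faster(LOADS_FF, TPCQ_F_PS["TI"], TPCQ_F_PS["INV-TG"])

for name, slope in slopes.items():
    print(f"{name}: {slope:.2f} ps/fF")
print(f"TI Tpcq-F first drops below INV-TG at {crossover} fF")
```

With the Table I values, the average slope is roughly 3.9 ps/fF for TI and 4.1 ps/fF for INV-TG, and TI first becomes faster at the 0.8fF load, which agrees with the observation that *Tpcq-F* favors TI as the capacitance rises.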

b) Second simulation (Tsetup-F): The output rises in the second CLK cycle and then falls in the third cycle, where Tpcq-F is obtained as the reference. The output rises in the fourth cycle and falls again in the fifth cycle, where Tpcq-F is obtained again. To find the shortest Tsetup-F that keeps the fifth-cycle Tpcq-F below 110% of the reference (third cycle), the *bisection* method is applied, with 110% of the reference as the stop condition. The *binary search* function performs the *bisection* method until the stop condition is reached, with a tolerance of 1ps. Since IN and CLK have equal skew, Tsetup-F is the time CLK reaches 1V minus the time IN reaches 1V.

The simulation showed that TI has a time of 28.125ps and INV-TG of 35.9375ps for all capacitances. *Tsetup-F* doesn't change with the increase of the capacitance. TI is shorter.
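The bisection search used for Tsetup-F can be sketched in a few lines. This is a minimal illustration and not the MDL *binary search* function: the helper `shortest_passing`, the search interval, and in particular the delay model `toy_tpcq` are hypothetical stand-ins for a transient simulation, and the sketch assumes that Tpcq decreases monotonically as the setup interval grows.

```python
import math
from typing import Callable


def shortest_passing(passes: Callable[[float], bool], lo: float, hi: float,
                     tol: float = 1.0) -> float:
    """Bisection for the smallest value in [lo, hi] that satisfies `passes`.

    Assumes `passes` is False at lo, True at hi, and switches only once.
    The returned value passes and lies within `tol` of the true boundary.
    """
    if passes(lo) or not passes(hi):
        raise ValueError("interval must bracket the pass/fail boundary")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if passes(mid):
            hi = mid  # mid is acceptable, try a shorter interval
        else:
            lo = mid  # mid violates the tolerance, move up
    return hi


TPCQ_REF_PS = 129.50  # reference Tpcq-F (TI, 0.2 fF) taken from Table I


def toy_tpcq(setup_ps: float) -> float:
    """Hypothetical model: Tpcq grows as the setup interval shrinks."""
    return TPCQ_REF_PS * (1.0 + 0.5 * math.exp(-setup_ps / 20.0))


def within_tolerance(setup_ps: float) -> bool:
    """Stop condition: Tpcq no greater than 110% of the reference."""
    return toy_tpcq(setup_ps) <= 1.10 * TPCQ_REF_PS


tsetup = shortest_passing(within_tolerance, lo=0.0, hi=200.0, tol=1.0)
print(f"shortest passing setup interval (toy model): {tsetup} ps")
```

In an actual flow, `within_tolerance` would run one transient simulation per call and compare the measured Tpcq with the reference, so the number of simulations grows only logarithmically with the ratio between the search interval and the tolerance.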

c) Third simulation (Tsetup-R): The output rises in the second CLK cycle, where Tpcq-R is obtained as the reference. The output falls in the third cycle and rises again in the fourth cycle, where Tpcq-R is obtained again. To find the shortest Tsetup-R that keeps the fourth-cycle Tpcq-R below 110% of the reference, the method from the second simulation is applied.

The simulation showed that TI has a time of 87.5ps and INV-TG of 96.875ps for all capacitances. *Tsetup-R* doesn't change with the increase of the capacitance. TI is shorter than INV-TG. Tsetup-R is significantly greater than Tsetup-F.

*d)* Fourth simulation (Thold-F): The same method used in the second simulation is applied, but this time IN falls as close as possible to the fifth CLK rising edge. Thold-F is the time IN starts to fall minus the time CLK reaches 1V.

The simulation showed that both TI and INV-TG have a time of -234.38ps for all capacitances. *Thold-F* doesn't change with the increase of capacitance. There is no difference in using TI or INV-TG. *Thold-F* is negative because IN starts to fall before the CLK rising edge.

*e) Fifth simulation (Thold-R):* Same method used in the third simulation is applied but with the *Thold* considerations from the fourth simulation.

The simulation showed that TI has a time of -187.5ps and INV-TG of -195.3ps for all capacitances. Thold-R doesn't change with the increase of capacitance. In magnitude, TI is shorter than INV-TG. Thold-R is negative because the input falls before the CLK rising edge.

*f)* Sixth simulation (PulseSTR-F): The same method used in the second simulation is applied up to the fifth cycle. In the SPICE testbench, the PWL IN is defined as a function of a variable that is initially 1ps on each side of the start of the CLK skew in the fifth cycle. A binary search is used in MDL to search this variable, with 1ps tolerance, until *Tpcq-F* is below 110% of the reference.

According to Table II the time difference between both topologies is small, but INV-TG needs a smaller pulse than TI as the capacitance rises. Both are equal for small capacitance.

TABLE II PulseSTR-F Length Results

| Cap (fF)    | 0.2   | 0.4   | 0.8   | 1.9   | 3.8   | 7.6   | 15.2  |
|-------------|-------|-------|-------|-------|-------|-------|-------|
| TI (ps)     | 48.83 | 48.83 | 50.78 | 52.73 | 54.68 | 58.58 | 70.29 |
| INV-TG (ps) | 48.83 | 48.83 | 48.83 | 50.78 | 50.78 | 54.68 | 66.39 |

*g)* Seventh simulation (*PulseSTR-R*): Same method used in the third simulation is applied till the fourth cycle. From this point on, the pulse metric from sixth simulation is applied.

According to Table III TI is shorter than INV-TG. *PulseSTR-R* is significantly shorter than *PulseSTR-F*. The simulation for 15.2fF should be disregarded as the results were out of an acceptable tolerance.

TABLE III PulseSTR-R Length Results

| Cap (fF)    | 0.2   | 0.4   | 0.8   | 1.9   | 3.8   | 7.6   | 15.2 |
|-------------|-------|-------|-------|-------|-------|-------|------|
| TI (ps)     | 15.66 | 15.66 | 15.66 | 15.66 | 15.66 | 17.61 | -    |
| INV-TG (ps) | 19.56 | 19.56 | 19.56 | 19.56 | 19.56 | 21.51 | -    |

*h)* Eighth simulation (*PulseMID-F*): Exactly the same method described in the sixth simulation is applied; the difference resides in where the pulse is centered. This time the pulse is centered where the active CLK skew is at the middle voltage, 0.5V.

According to Table IV, although the time difference between both topologies is small, INV-TG needs a smaller pulse than TI. The pulse at the middle of the CLK skew is greater than its equivalent at the CLK skew start.

TABLE IV PulseMID-F Length Results

| Cap (fF)    | 0.2    | 0.4    | 0.8    | 1.9    | 3.8    | 7.6    | 15.2   |
|-------------|--------|--------|--------|--------|--------|--------|--------|
| TI (ps)     | 150.29 | 150.29 | 152.24 | 152.24 | 156.14 | 158.09 | 169.80 |
| INV-TG (ps) | 148.34 | 148.34 | 150.29 | 150.29 | 152.24 | 156.14 | 165.90 |

*i)* Ninth simulation (PulseMID-R): Exactly the same method described in the seventh simulation is applied; the difference resides in where the pulse is centered. This time the pulse is centered where the active CLK skew is at the middle voltage, 0.5V.

According to Table V, although the time difference between both topologies is small, TI needs a smaller pulse than INV-TG. The pulse at the middle of the CLK skew is greater than its equivalent at the CLK skew start. PulseMID-R is significantly shorter than PulseMID-F. The last simulation, for 15.2fF, should be disregarded as the results were out of an acceptable tolerance.

TABLE V PulseMID-R Length Results

| Cap (fF)    | 0.2   | 0.4   | 0.8   | 1.9   | 3.8   | 7.6   | 15.2 |
|-------------|-------|-------|-------|-------|-------|-------|------|
| TI (ps)     | 68.34 | 68.34 | 68.34 | 70.29 | 72.24 | 71.14 | -    |
| INV-TG (ps) | 82.00 | 83.95 | 83.95 | 83.95 | 85.90 | 89.80 | -    |

## V. FINAL CONSIDERATIONS

Considering the results, it is noticeable that the INV-TG and the TI feedback branches each present their own advantages and disadvantages. *Tsetup* and *Thold* do not depend on the output capacitance, and TI tends to have smaller times. The time arcs and the pulse lengths obtained with the Pulse Method tend to increase as the output capacitance increases. For the time arcs and the setup time, fall times are considerably shorter than their equivalent rise times. A worthy consideration is that the FFs sample IN more efficiently at the beginning of the active CLK skew. Taken together, these results provide a valid and reliable time-characterization. It is fundamental to highlight that all results and interpretations are only valid for the testbench described, where the transistor sizes and signal skews are all equal.

Based on the present paper, several future works are viable. Several improvements could be made to the simulations. Buffers could be used at IN and CLK to simulate a more realistic environment. The output capacitance values could cover a wider range, producing more expressive time differences between them. Instead of a 10% tolerance for *Tpcq* relative to the reference, a 5% tolerance could be used, representing a stricter operation. Two other, deeper improvements are to repeat the simulations considering different transistor sizes and different voltage skews for IN and CLK. These improvements would extend the validity of the results to a much wider range of FF-D implementations.

#### **ACKNOWLEDGEMENTS**

The authors would like to acknowledge the support of the National Council for Scientific and Technological Development (CNPq) and the State of Rio Grande do Sul Research Foundation (FAPERGS).

#### References

- [1] N. Weste and D. Harris, *CMOS VLSI Design: A Circuits and Systems Perspective*. Pearson/Addison-Wesley, 2005.
- [2] R. M. Brum, "On the design of hybrid CMOS and magnetic memories, with applications to reconfigurable architectures," Ph.D. dissertation, University of Montpellier, Montpellier, France, 2014.
- [3] G. Neuberger, "Protecting digital circuits against hold time violations due to process variations," Ph.D. dissertation, UFRGS, Porto Alegre, Brazil, 2007.
- [4] D. Markovic, B. Nikolic, and R. Brodersen, "Analysis and design of low-energy flip-flops," in *Proceedings of the 2001 International Symposium on Low Power Electronics and Design*, 2001, pp. 52–55.
- [5] V. Stojanovic and V. G. Oklobdzija, "Comparative analysis of master-slave latches and flip-flops for high-performance and low-power systems," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 4, pp. 536–548, 1999.
- [6] NanGate Inc., "Nangate 45nm open cell library," 2009.